Dynamic Clustering of Evolving Streams with a Single Pass
Abstract
Stream data is common in many applications, e.g., stock quotes, merchandise sales records, and system logs. Analyzing such stream data is of great importance. As one of the most commonly used techniques, clustering on streams can help to detect and monitor correlations among streams. Due to the unique nature of streaming data, direct application of most existing clustering algorithms fails to deliver efficient results. In this project, we introduce a novel model of stream clusters that employs a weighted distance measure. In addition, we devise a novel, efficient algorithm that can effectively discover all stream clusters.
Similar resources
Mining Evolving Web Clickstreams with Explicit Retrieval Similarity Measures
Data on the Web is noisy, huge, and dynamic. This poses enormous challenges to most data mining techniques that try to extract patterns from this data. While scalable data mining methods are expected to cope with the size challenge, coping with evolving trends in noisy data in a continuous fashion, and without any unnecessary stoppages and reconfigurations is still an open challenge. This dynam...
Single-Pass Algorithms for Mining Frequency Change Patterns with Limited Space in Evolving Append-Only and Dynamic Transaction Data Streams
In this paper, we propose an online single-pass algorithm MFC-append (Mining Frequency Change patterns in append-only data streams) for online mining frequent frequency change items in continuous append-only data streams. An online space-efficient data structure called ChangeSketch is developed for providing fast response time to compute dynamic frequency changes between data streams. A modifie...
A framework for mining evolving trends in Web data streams using dynamic learning and retrospective validation
The expanding and dynamic nature of the Web poses enormous challenges to most data mining techniques that try to extract patterns from Web data, such as Web usage and Web content. While scalable data mining methods are expected to cope with the size challenge, coping with evolving trends in noisy data in a continuous fashion, and without any unnecessary stoppages and reconfigurations is still a...
TECNO-STREAMS: Tracking Evolving Clusters in Noisy Data Streams with a Scalable Immune System Learning Model
Artificial Immune System (AIS) models hold many promises in the field of unsupervised learning. However, existing models are not scalable, which makes them of limited use in data mining. We propose a new AIS based clustering approach (TECNO-STREAMS) that addresses the weaknesses of current AIS models. Compared to existing AIS based techniques, our approach exhibits superior learning abilities, ...
Robust Clustering for Tracking Noisy Evolving Data Streams
We present a new approach for tracking evolving and noisy data streams by estimating clusters based on density, while taking into account the possible presence of an unknown number of outliers, the emergence of new patterns, and the forgetting of old patterns. Keywords: evolving data streams, robust clustering, dynamic clustering, stream clustering, scalable clustering
Publication date: 2003